new test_1_week_1_day_1#1

Open

I0-OVI wants to merge 84 commits into I0-OVI:main from

Owner

I0-OVI commented Jul 23, 2025

I use the `os` library to check whether the backend is MLX or PyTorch, and there is a dedicated function and test for each case.
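The PR does not show how the check is done, so the following is only a minimal sketch of one way it could work, assuming the backend is selected through an environment variable. The variable name `TINY_LLM_BACKEND`, the default of `mlx`, and all function names here are hypothetical and do not come from the PR.

```python
import os


def detect_backend(env=None):
    """Return "mlx" or "pytorch" from a hypothetical environment variable."""
    # Assumption: TINY_LLM_BACKEND is an illustrative name, not one used by the PR.
    env = os.environ if env is None else env
    value = env.get("TINY_LLM_BACKEND", "mlx").strip().lower()
    if value not in ("mlx", "pytorch"):
        raise ValueError(f"unsupported backend: {value!r}")
    return value


def run_mlx_case():
    # Placeholder for the MLX-specific function and test.
    return "mlx case"


def run_pytorch_case():
    # Placeholder for the PyTorch-specific function and test.
    return "pytorch case"


def run_backend_case(env=None):
    """Dispatch to the function that matches the detected backend."""
    dispatch = {"mlx": run_mlx_case, "pytorch": run_pytorch_case}
    return dispatch[detect_backend(env)]()
```

Passing a plain dictionary instead of `os.environ` keeps the check easy to test without touching the real environment.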

Andy1314Chen and others added 20 commits

June 13, 2025 17:19


          fix Rotation matrix form of RoPE (#25)

080a53a


          add back installation check script

cd54459

Signed-off-by: Alex Chi <iskyzh@gmail.com>


          bump mlx to latest version (#33)

55066c3

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          test:add a test case to cover week_1_day_3_task3 (#31)

7fc05dc

* test:add a test case to cover week_1_day_3_task3

Closes: #23
Signed-off-by: Jiawei Zhao <Phoenix500526@163.com>

* fmt

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>

---------

Signed-off-by: Jiawei Zhao <Phoenix500526@163.com>
Signed-off-by: Alex Chi Z <iskyzh@gmail.com>
Co-authored-by: Alex Chi Z <iskyzh@gmail.com>


          fix tokp implementation

13295fc

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          more precision tweaks

4e9101c

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          fix bugs in continuous batching

ae06a7f

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          fix mask tests

cfbc43e

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          reshape in kvcache

5863d96

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          fix flash attention

55e7b0c

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          add back causal mask to gqa

d954fb5

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          flash attention works for the first token, maybe some mem init issue

e21a583

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          try debug flashattention on multi test run

9eacc3b

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          small fixes of flash attention

657c0b3

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          finally fully fix flash attention

8dfe61c

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          feat(kv-cache): add KV cache imports and week 2 day 1 tests (#35)

0ca2bf1

Add KV cache module imports to both tiny_llm and tiny_llm_ref packages
to enable KV cache functionality. Include comprehensive test suite for
week 2 day 1 covering embedding operations, model inference with KV
cache, and sequential token generation with offset support.

- Add KV cache imports to __init__.py files
- Create test_week_2_day_1.py with task 2-4 test coverage
- Support multiple Qwen2 model variants (0.5B, 1.5B, 7B)
- Include embedding call and as_linear functionality tests
- Add sequential generation tests with proper cache management
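The commit message above mentions sequential generation with offset support but does not include the code. As a rough illustration only, a KV cache can be sketched as a structure that appends new keys and values and reports how many positions it has stored so far. The class name `SimpleKVCache` and the method `update_and_fetch` are hypothetical, and plain Python lists stand in for tensors.

```python
class SimpleKVCache:
    """Illustrative KV cache: stores keys/values and tracks the token offset."""

    def __init__(self):
        self.keys = []
        self.values = []
        self.offset = 0  # number of positions already cached

    def update_and_fetch(self, new_keys, new_values):
        """Append the new entries and return the full cache plus the new offset."""
        if len(new_keys) != len(new_values):
            raise ValueError("keys and values must have the same length")
        self.keys.extend(new_keys)
        self.values.extend(new_values)
        self.offset += len(new_keys)
        return self.keys, self.values, self.offset


# Prefill with three positions, then decode one token at a time.
cache = SimpleKVCache()
cache.update_and_fetch(["k0", "k1", "k2"], ["v0", "v1", "v2"])
keys, values, offset = cache.update_and_fetch(["k3"], ["v3"])
```

After the second call the offset is 4, which is the position the next generated token would occupy.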


          refactor continuous batching

3cd7d84

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          chunked prefill only in continuous batching

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          update readme and roadmap

8b4d9a7

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          update the vllm-RoPE code link in the reading (#39)

024d528

I0-OVI closed this

I0-OVI reopened this

skyzh and others added 8 commits

August 9, 2025 16:01


          bfloat16 support for matmul

00ea990

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          model shortcut and dispatcher

850dd6c

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          qwen3 support

ffbd15d

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          update readme

cd87116

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          fix: resolve f-string syntax error in batch.py (#44)

4e1cced

Extract string replacement operation outside f-string expression
to avoid backslash in f-string expression part, which is not
allowed in Python syntax.

- Move .replace('\n', ' ') operation to separate variable
- Improves code readability and fixes SyntaxError
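The fix described above can be sketched as follows. Before Python 3.12, a backslash inside the expression part of an f-string is a SyntaxError, so the replacement is computed first and the result is interpolated. The function name `format_prompt_preview` is hypothetical and is not taken from batch.py.

```python
def format_prompt_preview(prompt: str) -> str:
    """Return a one-line preview of a prompt that may contain newlines."""
    # On Python versions before 3.12 the following line does not parse:
    #   f"prompt: {prompt.replace('\n', ' ')}"
    # Moving the replacement into its own variable avoids the backslash
    # inside the f-string expression.
    flattened = prompt.replace("\n", " ")
    return f"prompt: {flattened}"
```

For example, `format_prompt_preview("hello\nworld")` returns `"prompt: hello world"`.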


          remove offset in week 1, not used

1d7572f

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          add week2day1 kv cache contents

0b82b7f

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          update benches

45cff24

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>

skyzh and others added 30 commits

September 26, 2025 01:37


          ensure user solution can run

ff5d7d0

Signed-off-by: Alex Chi Z <iskyzh@gmail.com>


          add definition hint for model args

cf6910a

Signed-off-by: Connor1996 <zbk602423539@gmail.com>


          add more info

a30f9c2

Signed-off-by: Connor1996 <zbk602423539@gmail.com>


          rename

83762c8

Signed-off-by: Connor1996 <zbk602423539@gmail.com>


          Merge pull request #73 from Connor1996/model-args

8eebd4a


          fix: fix link to Qwen2.5 blog in week1 (#72)

f1f4f98

Co-authored-by: Yangchen Ye <yangchenye@Yangchens-MacBook-Pro.local>


          docs: add instruction to download Qwen2-1.5B model (#75)

cea8926

* docs: add instruction to download Qwen2-1.5B model


          perform pdm sync before running (#76)

5dc71b8

Signed-off-by: Connor1996 <zbk602423539@gmail.com>


          Fix f-string syntax (#81)

ace6e45

Extract newline character to a variable to avoid backslash in f-string expression part.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>


          fix: draft-generate offset (#83)

5b6fdc3

Signed-off-by: KKKZOZ <kkkzoz@qq.com>


          fix mx.logsumexp with the right dim (#80)

16f55c7
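The diff for this commit is not shown here. As an illustration of why the reduction dimension matters, the sketch below computes a log-softmax over a batch of logits in plain Python: the log-sum-exp must be taken per row (over the vocabulary axis), not over the whole batch. The helper names are hypothetical, and this pure-Python version only mirrors what a call such as `mx.logsumexp` with an axis argument would reduce over.

```python
import math


def logsumexp(row):
    """Numerically stable log(sum(exp(x))) over one row of logits."""
    m = max(row)
    return m + math.log(sum(math.exp(x - m) for x in row))


def log_softmax_last_axis(logits):
    """Normalize each row independently, i.e. reduce over the last axis."""
    return [[x - logsumexp(row) for x in row] for row in logits]


def log_softmax_wrong_axis(logits):
    """Incorrect variant: reduces over every element of the batch at once."""
    total = logsumexp([x for row in logits for x in row])
    return [[x - total for x in row] for row in logits]


batch = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]
correct = log_softmax_last_axis(batch)
wrong = log_softmax_wrong_axis(batch)
```

With the correct axis, the probabilities in each row sum to 1. With the wrong reduction they sum to 1 only across the whole batch, so each row is under-normalized.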


          feat: implement quantized_matmul with typed CPU implementation (#77)

685caf5

- Add complete quantized_matmul_impl_typed template function for CPU, which supports float16, float32, and bfloat16 data types
- Add float32 test cases for quantized_matmul
- Adjust float32 tolerance in test utils for better precision


          book: remove deprecated mdbook multilingual key (#86)

c9f05de


          ci: update mdbook preprocessors for 0.5 pipeline (#87)

e34dc7e


          add AGENTS.md (#85)

0c95267

Signed-off-by: Connor1996 <zbk602423539@gmail.com>


          docs: add Week 2 Day 2-3 Quantized Matmul chapter CPU part (#88)

b2393a2

* docs: add Week 2 Day 2-3 Quantized Matmul chapter

- Add quantized matmul documentation (week2-02-quantized-matmul.md)

Signed-off-by: Connor1996 <zbk602423539@gmail.com>


          docs: add Week 2 Day 2-3 Quantized Matmul chapter GPU part (#89)

0688e96

* docs: add week2 quantized matmul GPU part

Signed-off-by: Connor1996 <zbk602423539@gmail.com>


          doc: add tokenizer definition reference (#90)

1cd513b

Signed-off-by: Connor1996 <zbk602423539@gmail.com>


          docs: mark week 2.3 tiny_llm status as complete (#91)

e9b90bd


          Add bench-main command and week2 benchmark instructions (#93)

f4dc967


          fix(ref): correct attention weight shape asserts (#92)

ed8ac9e


          bugfix: way to get_kernel from library

9b40133


          Merge pull request #94 from fuyufjh/fix_metal_get_kernel

bugfix: way to get_kernel from library


          book: replace huggingface-cli with hf

5b2f184

Signed-off-by: you06 <you1474600@gmail.com>


          Merge pull request #95 from you06/doc/update-huggingface-cli

ce64300

book: replace `huggingface-cli` with `hf`


          tests: parametrize flash attention mask coverage (#96)

2ace66c


          docs: add week2 flash-attention CPU part (#97)

bb1e902

* docs: add week2 flash-attention chapter links and draft

Signed-off-by: Connor1996 <zbk602423539@gmail.com>


          docs: add week2 entries to glossary (#98)

ddfcaed

Signed-off-by: Connor1996 <zbk602423539@gmail.com>


          ref: support causal mask mode in flash attention (#99)

ff9cd45

Signed-off-by: Connor1996 <zbk602423539@gmail.com>


          docs: add a reading to week 2 flash attention (#100)

71cc7be

